
107525539 

WO 2004/019653 PCT/US2003/026998






TITLE OF THE INVENTION 
PARAMETRIC ARRAY MODULATION AND PROCESSING METHOD

CROSS REFERENCE TO RELATED APPLICATIONS 

The present application incorporates by reference prior commonly owned U.S. 
Applications 60/185,235; 60/406,230; and 09/300,033. 



FIELD AND BACKGROUND OF THE INVENTION

The parametric array uses the nonlinear response of a transmission medium
such as air or water to demodulate ultrasonic frequency waves, converting any audio
frequency signals modulated onto the ultrasonic waves into audio frequency waves.
This phenomenon is useful for directing a beam of ultrasound having audio modulated
thereon to a specified region, where it is demodulated in the medium and can be heard.

In a typical case of a parametric array in air, the audible signal will be
from voice, music or another normal audio frequency source. Prior to or subsequent to
the conversion or modulation into the ultrasound frequency range, some form of signal
processing is typically undertaken. This processing compensates for the non-flat
frequency response of typical ultrasound transducers, the transducer nonlinearity,
environmental conditions of temperature and humidity, and the position of the listening
recipient, among other effects that prevent a faithful reproduction of the original sound
to the listener. The response of air to ultrasound is also nonlinear and may need
compensation prior to the actual ultrasound emission.

The form of modulation typically employed provides a signal envelope on an
ultrasonic carrier. The carrier and envelope respond differently to the signal
processing and to the non-flat and nonlinear characteristics of the environment and
devices used in this type of ultrasonic sound beaming from source to listener, with a
resulting lack of alignment.

SUMMARY OF THE INVENTION



The present invention uses an envelope detector to supply an adjusting offset to
the source audio signal, such that the envelope of the audio signal, when summed with
the audio signal, yields a result that is entirely positive (or entirely negative). When
this is the case, a nonlinear preprocessing method (such as taking the square root, or
another nonlinear function) can be applied accurately. In addition, residual sound
generated by the envelope signal should be inaudible in the resulting demodulated beam.

A preferred way to accomplish this is to "look ahead" at the audio signal,
to see what the (peak) values will be some time in the future, and begin adjusting the
envelope signal well in advance of a change. Because a processing system cannot
actually look into the future, the invention either estimates the signal based on past
knowledge or allows some small delay between the input and output signals. Using an
audio signal delay, it is possible to anticipate the audio signal, and to change the
envelope signal accordingly so that the sum conforms and remains positive. The result
is an envelope signal which faithfully follows the peak levels of the audio signal, but
changes only gradually (ensuring only very low-frequency residue).


DESCRIPTION OF THE DRAWING 



These and other features of the invention are more fully described in the detailed
description below in conjunction with the Drawing, of which:

Figs. 1A - 1D are waveform diagrams of the summed audio and envelope useful
in understanding the invention;
Fig. 2 is a block diagram illustrating the practice of the invention;
Fig. 3 is a block diagram illustrating signal processing providing compensation
for characteristics of using ultrasound audio projection according to the invention;
Figs. 4A - 4D illustrate signal processing associated with Fig. 3 circuitry; and
Figs. 5A - 5C illustrate further signal processing associated with Fig. 3 circuitry.

DETAILED DESCRIPTION 

The generation of ultrasound from audible sound will in general require some
step of modulation, which is simply a re-scaling of frequency. We can write this
modulation step as:

$$p(t) = M(t)\,\sin(\omega t)$$


where M(t) can be termed the modulation envelope, and ω is the carrier frequency.
Basic parametric array theory predicts that, upon demodulation, the resulting audible
sound q(t) is approximately proportional to the second derivative of the square of the
modulation envelope:

$$q(t) \propto \frac{d^2}{dt^2}\left(M^2(t)\right)$$

This function is an approximation, but serves well to illustrate the ideas contained 
herein. 


We can define an arbitrary preprocessing function P{z}, which accepts some
signal (primarily) in the audio range as input, and outputs a processed signal suitable
for modulation at the output. Thus, we generally have M(t)=P{z}, where z is some
function of the audio signal g(t). Note that the algorithm P{z} may also accept inputs
such as environmental condition, listener position, desired sound quality, etc.

Early parametric arrays used simple amplitude modulation (AM) to generate the
audible signal, using M(t)=(1+mg(t)), or P{z}=z, where g(t) is the audible signal to be
reproduced (assumed to be normalized to unity magnitude), and m is the modulation
depth, usually taken as one or slightly less than one. The resulting sound generated is:

$$q(t) \propto \frac{d^2}{dt^2}\left(M^2(t)\right) = 2m\,\frac{d^2}{dt^2}\left(g(t)\right) + m^2\,\frac{d^2}{dt^2}\left(g^2(t)\right)$$

We can see that the resulting sound contains the desired linear term g(t) (we will
now omit the second derivative, as it is a simple equalization step that can be
compensated by integrating g(t) twice). We also have a nonlinear term m²g²(t), which
corresponds to distortion. If we require m to be small, the distortion will be reduced, but
the desired output signal will also be reduced in proportion to m, which is
undesirable.
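
As a numerical check on the expansion above, the following Python sketch evaluates
M²(t) for a single unit-amplitude cosine tone and measures the ratio of the
second-harmonic distortion to the desired fundamental after the second derivative. The
function names, the single-tone test signal and the sample count are assumptions made
for illustration only and are not part of the method described here.

```python
import math

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one period of a real signal (direct DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def am_distortion_ratio(m, n=256):
    """Second-harmonic to fundamental ratio in q(t) for M(t) = 1 + m*cos(theta)."""
    m_squared = [(1.0 + m * math.cos(2.0 * math.pi * i / n)) ** 2 for i in range(n)]
    fundamental = harmonic_amplitude(m_squared, 1)  # 2m for this test tone
    second = harmonic_amplitude(m_squared, 2)       # m^2 / 2 for this test tone
    # The second derivative weights the k-th harmonic by k^2.
    return (4.0 * second) / fundamental

print(am_distortion_ratio(0.5))
```

For this single-tone case the ratio works out to m itself, so reducing m lowers the
relative distortion only at the cost of an equally reduced wanted output (which is
proportional to 2m).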

An improved method is to use P{z}=z^(1/2), where z=1+mg(t), leading to
M(t)=(1+mg(t))^(1/2). Upon demodulation, the resulting audible signal is proportional to
mg(t). While this is the result we seek, there are two drawbacks to this method. First,
taking the square root of a signal results in the generation of a substantial set
of harmonics, which increase the required bandwidth of the ultrasonic transmission
system. If the bandwidth of the transmission system is insufficient to reproduce the
entire ultrasonic signal, distortion will result. This was investigated theoretically in [1]
and experimentally in [2].
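
The bandwidth cost of the square root can be illustrated numerically. The sketch below
assumes a single cosine tone for g(t) and full modulation depth (m = 1), and compares
the harmonic content of the square-rooted envelope M(t) with that of its square, which
is what the medium demodulates. The helper names and parameter values are illustrative
assumptions.

```python
import math

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one period of a real signal (direct DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def sqrt_envelope_spectra(m, count, n=1024):
    """Harmonics 1..count of M = sqrt(1 + m*cos(theta)) and of M squared."""
    envelope = [math.sqrt(1.0 + m * math.cos(2.0 * math.pi * i / n)) for i in range(n)]
    demodulated = [v * v for v in envelope]  # the medium responds to M^2
    env_h = [harmonic_amplitude(envelope, k) for k in range(1, count + 1)]
    dem_h = [harmonic_amplitude(demodulated, k) for k in range(1, count + 1)]
    return env_h, dem_h

env_h, dem_h = sqrt_envelope_spectra(1.0, 8)
for k, (a, b) in enumerate(zip(env_h, dem_h), start=1):
    print(k, round(a, 5), round(b, 5))
```

In this sketch the squared envelope contains only the fundamental, while the envelope
itself carries harmonics that decay slowly with harmonic number; a transmission chain
that truncates them no longer squares back to a clean tone.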

The preprocessing function P{z}=z^(1/2) is a reasonably effective method of
preprocessing the audible signal at low modulation frequencies and low ultrasonic
amplitude. However, to improve performance, P{z} should be altered to more
accurately model the true nonlinear modulation function. The particular algorithm is
described elsewhere (see the referenced provisional application), but we can generalize
the function with a nonlinear piecewise polynomial function, and perhaps a linear filter.

A major shortcoming of having the argument z=1+mg(t) is that when no audible
sound is intended to be reproduced (g(t)=0), the modulation function M(t) is unity
(M(t)=1). This means that the system is still outputting high levels of ultrasound which
are not being used to create audio, as the output signal is still p(t)=sin(ωt).

To alleviate this latter shortcoming, it has been proposed to use a modulation
envelope which contains an envelope follower. This is commonly implemented as a
rectifier and low-pass filter. The detected envelope of the audio input signal is intended
to be a faithful follower of the amplitude of the input signal, although with some time
delay. Adding this envelope to the audio signal can provide a suitable offset which
keeps the signal positive, allowing an accurate preprocessing operation:



$$z(t) = e(t) + g(t)$$

$$M(t) = P\{z\} = \left(e(t) + g(t)\right)^{1/2}$$

$$q(t) \propto \frac{d^2}{dt^2}\left(M^2(t)\right)$$

$$q(t) \propto \frac{d^2}{dt^2}\left(e(t) + g(t)\right)$$



The audible signal g(t) (with second derivative omitted) is reproduced as before,
and there is a residual term consisting of the second derivative of the audio envelope
e(t). As long as the frequency of this term is low (recall that it is the result of a
low-pass filter), it should not produce substantial distortion components. In general,
we wish the frequency of the envelope e(t) to be lower than about 100 Hz.

Fig. 1A illustrates the problem with this type of envelope detector: the
envelope curve 12, in following the audio signal with a cutoff in the vicinity of 100 Hz,
is not a good peak follower and is at times below the audio signal.
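
The shortfall of a conventional follower can be reproduced with a short simulation. The
sketch below assumes a one-pole low-pass filter with a 100 Hz cutoff, a 48 kHz sample
rate and a 1 kHz tone burst that switches on abruptly; these values and the function
names are illustrative choices, not parameters taken from Fig. 1A.

```python
import math

def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """First-order recursive low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state, out = 0.0, []
    for s in signal:
        state += alpha * (s - state)
        out.append(state)
    return out

def simple_follower_offset(x, cutoff_hz, sample_rate):
    """Rectify, low-pass, and add the resulting envelope to the audio signal."""
    envelope = one_pole_lowpass([abs(s) for s in x], cutoff_hz, sample_rate)
    return [xi + ei for xi, ei in zip(x, envelope)]

sample_rate = 48000
# 10 ms of silence followed by 90 ms of a 1 kHz tone at unit amplitude
x = [0.0] * 480 + [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate)
                   for n in range(4320)]
offset = simple_follower_offset(x, 100.0, sample_rate)
print(min(offset))
```

The minimum of the summed signal is negative in this example: the filtered envelope
rises too slowly at the onset and, once settled, sits near the average of the rectified
tone and not at its peak, so a square root applied to the sum would be undefined at
those samples.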

The main goal of the envelope detector is to supply an appropriate offset to the
incoming audio signal, such that the envelope, when added to the audio signal, is
entirely positive. When this is the case, a nonlinear preprocessing method P{z} (such
as taking the square root, or another nonlinear function) can be applied accurately. In
addition, residual sound generated by the envelope signal e(t) should be inaudible in
the resulting demodulated beam. However, the slowly changing function e(t) cannot
accurately keep up with a generally fast-changing dynamic audio signal g(t).

An elegant solution according to the invention is to "look ahead" at the audio
signal, to see what the (peak) values will be some time in the future, and begin
adjusting the envelope signal well in advance of a change. Because our processing
system must be causal (we cannot actually look into the future), we may either guess at
the signal based on past knowledge, or, even easier, allow some small delay between
the input and output signals. If the audio signal is delayed, there is opportunity to
anticipate the signal, and begin to change the envelope signal accordingly. The result
is an envelope signal which faithfully follows the peak levels of the audio signal, but
changes only gradually (ensuring only low-frequency residue).

A block diagram showing this method employed in discrete time (as in a DSP) is
shown in Fig. 2. As the processing may be digital or analog, hardware or software
based, it is to be understood that the circuit description applies to hardware modules
or processing steps in the following description.


The input (audio) signal x[n], 16, which may have been processed in a processor
18 (e.g. equalized, filtered to remove all low-frequency content, etc.) is split onto two
paths 20 and 22. The signal on path 20 is first rectified in a rectifier module or step
(assuming DSP) 24, or subjected to other envelope detection, to determine its magnitude.
The peak value of this rectified signal is tracked in a module or step 26 over the
previous M samples, represented by p[n]. This peak signal p[n] is then low pass filtered
in a filter 28, generally with a very low cutoff frequency, and the result is the envelope
e[n]. The raw signal x[n] on path 22 is delayed in delay module or step 30 by N samples
or a predetermined interval, which effectively corresponds to the settling time (or group
delay) of the low pass filter 28, plus any other delay present in the signal path. This
ensures that the envelope e[n] is properly aligned to the audio signal x[n]. Finally, the
envelope signal and audio signal are summed in a summer as shown. The result is an
accurately offset signal, which is always positive, and is suitable for nonlinear
preprocessing (such as square-rooting) through P{z} and modulation.
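
The signal flow of Fig. 2 can be sketched in Python as follows. The sketch makes
several assumptions that are not specified above: the low pass filter 28 is taken to be
a moving average of `lowpass_len` samples, the peak-hold length M is set equal to that
length, and the delay N is set to `lowpass_len - 1`. With these particular choices every
peak value entering the average already covers the delayed audio sample, so the sum is
non-negative up to rounding error. Other filters and delays are equally possible.

```python
import math

def lookahead_offset(x, lowpass_len):
    """Rectify, peak-hold, low-pass, delay and sum, following Fig. 2."""
    hold = lowpass_len        # M: peak-hold window (assumed choice)
    delay = lowpass_len - 1   # N: audio path delay (assumed choice)
    rectified = [abs(s) for s in x]
    # p[n]: peak of the rectified signal over the previous `hold` samples
    peak = [max(rectified[max(0, n - hold + 1): n + 1]) for n in range(len(x))]
    # e[n]: moving-average low-pass of p[n]; history before n = 0 is zero
    envelope, running = [], 0.0
    for n, p in enumerate(peak):
        running += p
        if n >= lowpass_len:
            running -= peak[n - lowpass_len]
        envelope.append(running / lowpass_len)
    # x[n - N]: delayed audio path, zero-padded at the start
    delayed = ([0.0] * delay + list(x))[:len(x)]
    summed = [d + e for d, e in zip(delayed, envelope)]
    return delayed, envelope, summed

sample_rate = 48000
# 10 ms of silence followed by 90 ms of a 1 kHz tone at unit amplitude
x = [0.0] * 480 + [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate)
                   for n in range(4320)]
delayed, envelope, summed = lookahead_offset(x, 240)  # 5 ms window
print(min(summed))
```

Because the envelope starts rising before the delayed tone burst arrives, the offset
is already in place at the onset, and the envelope changes by at most 1/`lowpass_len`
of the peak level per sample.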

As a variation, since we are primarily concerned with keeping the offset signal
positive, we need only concern ourselves with tracking the negative peaks of x[n]. Thus
the absolute value or rectifier function could be replaced with an inverter (-x), and the
peak detector, rather than locating the maximum of M previous samples of x[n], would
locate and track the minimum (maximally negative) samples of x[n].

Fig. 1B, and Fig. 1C, which is a magnification of a portion 34 of Fig. 1B, show the
effect of this processing, in which the envelope and signal functions are brought into
alignment. Fig. 1D shows the ratio of the signal level to the noise level for such an
approach as a function of frequency and illustrates acceptable S/N values.

Fig. 3 illustrates a complete system from audio input to sound wave output and
applies generally to either hardware or software realization. The x[n] + e[n] output of
the Fig. 2 processing is optionally applied to an upsampling and low pass filtering
module or step 40, which improves the available bandwidth for use in a subsequent
preprocessing module or step 42 characterized by the function P{z}. The
preprocessing function, more fully described below, is generally nonlinear and functions
at least in part to compensate for the nonlinearities in the ultrasound generation and
demodulation functions. The sampling rate of function 40 is preferably sufficiently high
that the harmonics inherent in the nonlinear processing do not alias and create
unintended distortion.
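
The aliasing concern can be made concrete with a small helper that folds a frequency
into the band representable at a given sampling rate. The function name and the example
frequencies (a 45 kHz harmonic, such as the third harmonic of a 15 kHz tone, at
sampling rates of 48 kHz and 192 kHz) are illustrative assumptions.

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a component appears after sampling at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

for rate in (48000.0, 192000.0):
    print(rate, aliased_frequency(45000.0, rate))
```

At the lower rate the 45 kHz harmonic folds down into the audio band, whereas at the
higher rate it stays at 45 kHz, where it can be removed or carried through to the
modulator.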

The preprocessed signal may subsequently be upsampled and low pass filtered
in module or step 44, after which it is modulated in modulator or step 46 onto an
ultrasonic frequency carrier from carrier generator or step 48.

The modulated signal is then optionally post processed in a module or step 50,
which may include equalization to compensate for frequency dependent variations in
the transfer functions of subsequent ultrasonic amplifiers 52 and transducers 54, or
nonlinear processing to compensate for nonlinear transfer functions in these same
elements 52 and 54. Other processing may be added here to accommodate
environmental air characteristics or phased array phasing. Alternatively, the modulated
and/or post processed signal could be converted into a pulse width modulated waveform
or the like for driving amplifiers 52 where they are switching amplifiers.

The preprocessing module or step 42, in the simplest form consistent with
providing a low distortion audio demodulated signal, has a square root function.
Because the nonlinear nature of the preprocessing generates harmonics, and because
the subsequent amplification and transduction functions have limited bandwidth, other
approaches may be used, such as a polynomial expansion of the type

$$P\{z\} = a_0 + a_1 z + a_2 z^2 + \cdots$$

the a's being functions of the particular environmental and processing system
characteristics. More sophisticated processing could use a polynomial with coefficients
that are nonzero only for specific value ranges, or other forms of series.
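
A polynomial preprocessor of this type can be evaluated with Horner's rule, as in the
sketch below. The coefficients shown are purely hypothetical placeholders (the
second-order Taylor expansion of the square root about z = 1); in practice the a's
would be derived from the environmental and system characteristics, which are not
given here.

```python
import math

def preprocess_poly(z, coeffs):
    """Evaluate P{z} = a0 + a1*z + a2*z^2 + ... by Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * z + a
    return result

# Hypothetical coefficients: Taylor expansion of sqrt(z) about z = 1,
# 1 + (z - 1)/2 - (z - 1)^2/8 = 3/8 + (3/4) z - (1/8) z^2
SQRT_TAYLOR = [0.375, 0.75, -0.125]

for z in (0.81, 1.0, 1.21):
    print(z, preprocess_poly(z, SQRT_TAYLOR), math.sqrt(z))
```

A polynomial of degree K applied to a band-limited input produces an output whose
bandwidth is at most K times that of the input, which is the attraction of a truncated
series over an exact square root when the amplifier and transducer bandwidth is limited.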

The processing of a monotone, shown in Fig. 4A, at the point of preprocessing to
give E(t) is shown in Fig. 4B. This creates singularities at or near the zero crossings,
which contribute to a large bandwidth due to the high values of the derivatives at the
abrupt reversal. One approach to curing this problem is to modulate E(t) with a bipolar
square wave F(t), as shown in Fig. 4C, which produces the low bandwidth signal
E(t)F(t) of Fig. 4D.
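
The bandwidth reduction can be checked numerically. The sketch below assumes that
E(t) is the square root of a fully modulated single tone, E = sqrt(1 + cos(2φ)), and
that F(t) is a unit square wave that changes sign at each zero of E; both are
assumptions about what Figs. 4B and 4C depict, and the names are illustrative.

```python
import math

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one period of a real signal (direct DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

n = 1024
phi = [2.0 * math.pi * i / n for i in range(n)]          # half the tone phase
E = [math.sqrt(1.0 + math.cos(2.0 * p)) for p in phi]    # cusps where E reaches zero
F = [1.0 if math.cos(p) >= 0.0 else -1.0 for p in phi]   # bipolar square wave
EF = [e * f for e, f in zip(E, F)]

for k in (1, 2, 4, 10):
    print(k, round(harmonic_amplitude(E, k), 5), round(harmonic_amplitude(EF, k), 5))
```

In this example E(t)F(t) reduces to a single sinusoid at half the tone frequency, while
E(t) alone has slowly decaying even harmonics. Since F² = 1, the squared envelope, and
hence the demodulated audio, is the same in both cases.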

This approach can be modified for use with real signals of unpredictable
frequency content by reversing the polarity of the bipolar signal F(t) based on an
estimating function in the preprocessor 42, which causes the polarity reversal when one
of the following criteria is met:

i. E(t) is in proximity to zero;

ii. The magnitudes of the derivatives of E(t) are high (either first or higher order
derivatives);

iii. E'(t) crosses zero from negative to positive (i.e. E' is zero while E" is
positive);

iv. A short-time power spectrum analysis is made and the result shows a high
bandwidth.
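
One possible estimating function is sketched below. It combines criteria i and iii,
reversing the polarity at a local minimum of the sampled envelope only when that
minimum lies below a threshold. Combining the two criteria, the threshold value and
the function name are all assumptions made for illustration; any of the listed criteria
could be used alone or together.

```python
import math

def track_polarity(envelope, threshold):
    """Return F[n] in {+1, -1}, reversed at near-zero local minima of the envelope."""
    polarity, out = 1.0, []
    last = len(envelope) - 1
    for i, value in enumerate(envelope):
        if 0 < i < last:
            falling_before = envelope[i - 1] > value   # slope negative before i
            rising_after = envelope[i + 1] >= value    # slope non-negative after i
            if falling_before and rising_after and value < threshold:
                polarity = -polarity
        out.append(polarity)
    return out

# Three periods of a square-rooted, fully modulated tone (400 samples per period)
phi = [2.0 * math.pi * i / 400 for i in range(1200)]
E = [math.sqrt(1.0 + math.cos(2.0 * p)) for p in phi]
F = track_polarity(E, 0.1)
EF = [e * f for e, f in zip(E, F)]
print(sum(1 for a, b in zip(F, F[1:]) if a != b))  # number of polarity reversals
```

For this test signal the tracker reverses polarity at each zero of E, and the product
E·F follows a smooth sinusoid. With real audio the minima will generally not reach
zero exactly, which is the situation addressed by the spline segment of Figs. 5B and 5C.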

Fig. 5A shows the operation when the signal envelope E(t) nears zero, showing
how E(t)F(t) functions to provide a smoothing function. In the case where the polarity
reversal occurs when the envelope signal E(t) is near but not at the zero crossing, as
shown in Fig. 5B, a discontinuity with consequent perturbations can occur. To rectify
this situation, preprocessor 42 can add in a spline segment, as shown in Figs. 5B and
5C, to produce a smooth transition from E(t) to E(t)F(t).

The features of the invention can be realized in alternative, equivalent ways. For
example, changing the carrier level directly and changing the offset of the audio signal
are mathematically substantially equivalent and thus functionally equivalent.